Virtual unrolling and information recovery from scanned scrolled historical documents
Abstract
The objective of our work is to enable the reading of fragile scrolled historical parchments without the need to physically unravel them, thus providing valuable information to a wide range of scholarly disciplines. This problem has not yet been properly investigated by the computer vision community because it depends on specialised parchment scanning technology: standard X-ray machinery is not sufficient, as the parchment ink must be extracted in addition to the parchment’s underlying structure. Effective data recovery is further compromised because the deterioration of the parchment makes the content of historical scrolled documents inaccessible. We create a 3D volumetric model of a scrolled parchment’s underlying geometry and digitally unwrap the parchment, producing a readable image of the text as output. The proposed recovery framework combines structure-preserving anisotropic filtering with robust segmentation, surface modelling and ink projection. We demonstrate with real examples how our algorithm recovers the underlying text and solves the major challenges of scrolled parchment analysis, namely segmentation of connected layers and processing the data without user interaction.
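The abstract names structure-preserving anisotropic filtering as the first stage of the framework but does not specify the filter. As a hedged illustration only, the sketch below uses the classic Perona-Malik diffusion scheme, a well-known edge-preserving filter that is swapped in here as a stand-in and is not claimed to be the authors' method. The function name `perona_malik` and the parameters `kappa` and `step` are assumptions made for this example, and the input is a single 2D slice stored as a list of lists rather than a full 3D volume.

```python
import math


def perona_malik(img, iterations=10, kappa=0.1, step=0.2):
    """Edge-preserving smoothing of a 2D slice (list of lists of floats).

    Hypothetical stand-in for the paper's filtering stage.
    kappa: contrast scale; differences much larger than kappa count as edges.
    step:  time step; must be <= 0.25 for stability with 4 neighbours.
    """
    rows, cols = len(img), len(img[0])
    cur = [row[:] for row in img]
    for _ in range(iterations):
        nxt = [row[:] for row in cur]
        for r in range(rows):
            for c in range(cols):
                total = 0.0
                # Visit the four direct neighbours; pixels outside the
                # image contribute nothing (no flux through the border).
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        diff = cur[rr][cc] - cur[r][c]
                        # Edge-stopping weight: close to 1 in flat regions,
                        # close to 0 across strong intensity jumps.
                        g = math.exp(-((diff / kappa) ** 2))
                        total += g * diff
                nxt[r][c] = cur[r][c] + step * total
        cur = nxt
    return cur


# Toy slice: dark background (0.0) next to a bright layer (1.0),
# with one noisy pixel in the background.
slice_2d = [[0.0] * 4 + [1.0] * 4 for _ in range(6)]
slice_2d[2][1] = 0.05
smoothed = perona_malik(slice_2d, iterations=20)
```

In this toy slice the isolated noisy pixel is diffused into its flat surroundings, while the sharp boundary between background and layer is left almost untouched because the edge-stopping weight is close to zero there. That behaviour is what makes this family of filters a plausible pre-processing step before segmenting closely packed parchment layers.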
Related articles
Segmentation of Parchment Scrolls for Virtual Unrolling
In this paper we introduce a framework for the segmentation of scanned scrolled parchments, based on a novel graph cut based approach with an additional shape prior, in combination with anisotropic diffusion and geometry-constrained postprocessing. This problem has not been investigated by the computer vision community properly yet due to the parchment scanning technology novelty, and is extrem...
Segmentation of Parchment Scrolls for Virtual Unrolling
There is a critical need to access the valuable information in historical scrolls that cannot be read by conventional means. In some cases, their physical deterioration is at such an advanced state that any attempt to unravel the document manually would cause catastrophic fragmentation, destroying the internal information. Use of X-ray microtomography, a new direction in digital document analys...
Information Extraction based on the Concept of Geographic Context
State-of-the-art graphics recognition technologies for extracting geographic information from scanned map images are very labor intensive and do not scale well to process a large number of maps. Moreover, many historical scanned maps suffer from poor graphical quality due to bleaching of the original paper maps and archiving practices, and are ill-posed for traditional one-time training and rec...
A Metadata Generation System for Scanned Scientific Volumes
Large scale digitization projects have been conducted at digital libraries to preserve cultural artifacts and to provide permanent access. The increasing amount of digitized resources, including scanned books and scientific publications, requires development of tools and methods that will efficiently analyze and manage large collections of digitized resources. In this work, we tackle the proble...
Journal title:
- Pattern Recognition
Volume 47
Publication date: 2014